Efficient Training of Graph-Regularized Multitask SVMs
Authors
Abstract
We present an optimization framework for graph-regularized multi-task SVMs based on the primal formulation of the problem. Previous approaches employ a so-called multi-task kernel (MTK) and are therefore inapplicable when the number of training examples n is large (in practice they handle only n < 20,000, even for just a few tasks). In this paper, we present a primal optimization criterion that allows for general loss functions, and we derive its dual representation. Building on the work of Hsieh et al. [1, 2], we derive an algorithm for optimizing the large-margin objective and prove its convergence. Our computational experiments show a speedup of up to three orders of magnitude over LibSVM and SVMLight on several standard benchmarks as well as on challenging data sets from computational biology. Combining our optimization methodology with the COFFIN large-scale learning framework [3], we are able to train a multi-task SVM on over 1,000,000 training points stemming from 4 different tasks. An efficient C++ implementation of our algorithm is being made publicly available as part of the SHOGUN machine learning toolbox [4].
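To make the idea of a primal, graph-regularized criterion concrete, the following minimal sketch evaluates one commonly used form of such an objective with the hinge loss: a per-task norm penalty, a graph term that penalizes differences between the weight vectors of related tasks, and the summed loss. The abstract does not state the exact criterion, so the formula, the function name `mtl_objective`, the task-similarity matrix `A`, and the trade-off constant `C` are all assumptions made for illustration, not the authors' definition.

```python
def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def hinge(margin):
    """Hinge loss max(0, 1 - margin)."""
    return max(0.0, 1.0 - margin)


def mtl_objective(W, A, data, C=1.0):
    """Evaluate an assumed graph-regularized multi-task SVM primal objective.

    W    : list of weight vectors, one per task
    A    : T x T task-similarity matrix (hypothetical; larger = more related)
    data : list of (x, y, task_index) with label y in {-1, +1}
    C    : loss trade-off constant (hypothetical default)

    Assumed form:
      0.5 * sum_t ||w_t||^2
      + 0.5 * sum_{s,t} A[s][t] * ||w_s - w_t||^2
      + C * sum_i hinge(y_i * <w_{t(i)}, x_i>)
    """
    num_tasks = len(W)

    # Standard per-task norm regularizer.
    norm_term = 0.5 * sum(dot(w, w) for w in W)

    # Graph regularizer: pulls weight vectors of similar tasks together.
    graph_term = 0.0
    for s in range(num_tasks):
        for t in range(num_tasks):
            diff = [a - b for a, b in zip(W[s], W[t])]
            graph_term += A[s][t] * dot(diff, diff)
    graph_term *= 0.5

    # Each example is scored only by the weight vector of its own task.
    loss_term = sum(hinge(y * dot(W[t], x)) for x, y, t in data)

    return norm_term + graph_term + C * loss_term


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]
    A = [[0.0, 1.0], [1.0, 0.0]]
    data = [([1.0, 0.0], 1, 0), ([0.0, 0.5], 1, 1)]
    print(mtl_objective(W, A, data))
```

Because this form works directly with the weight vectors, evaluating it costs time linear in the number of examples, which is the property that makes primal approaches attractive when n is large. With a kernel-based formulation the cost is instead tied to an n x n kernel matrix.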
Related resources
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
Deep multitask networks, in which one neural network produces multiple predictive outputs, are more scalable and often better regularized than their single-task counterparts. Such advantages can potentially lead to gains in both speed and performance, but multitask networks are also difficult to train without finding the right balance between tasks. We present a novel gradient normalization (Gr...
Sparse Representation for Detection of Microcalcification Clusters
We present an approach to detect MCs in mammograms by casting the detection problem as finding sparse representations of test samples with respect to training samples. The ground truth training samples of MCs in mammograms are assumed to be known a priori. From these samples of the object class of interest, a vocabulary of information-rich object parts is automatically constructed. The sparse r...
Adaptive Margin Support Vector Machines for Classi
In this paper we propose a new learning algorithm for classification learning based on the Support Vector Machine (SVM) approach. Existing approaches for constructing SVMs [12] are based on minimization of a regularized margin loss where the margin is treated equivalently for each training pattern. We propose a reformulation of the minimization problem such that adaptive margins for each training...
Unifying Adversarial Training Algorithms with Data Gradient Regularization
Many previous proposals for adversarial training of deep neural nets have included directly modifying the gradient, training on a mix of original and adversarial examples, using contractive penalties, and approximately optimizing constrained adversarial objective functions. In this article, we show that these proposals are actually all instances of optimizing a general, regularized objective we...
Multitask Feature Selection with Task Descriptors
Machine learning applications in precision medicine are severely limited by the scarcity of data to learn from. Indeed, training data often contains many more features than samples. To alleviate the resulting statistical issues, the multitask learning framework proposes to learn different but related tasks jointly, rather than independently, by sharing information between these tasks. Within th...